11 research outputs found

Mage - Reactive articulatory feature control of HMM-based parametric speech synthesis

Author: Astrinaki Maria
Dutoit Thierry
King Simon
Ling Zhen-Hua
Moinet Alexis
Richmond Korin
Yamagishi Junichi
Publication date: 01/01/2013

In this paper, we present the integration of articulatory control into MAGE, a framework for realtime and interactive (reactive) parametric speech synthesis using hidden Markov models (HMMs). MAGE is based on the speech synthesis engine from HTS and uses acoustic features (spectrum and f0) to model and synthesize speech. In this work, we replace the standard acoustic models with models combining acoustic and articulatory features, such as tongue, lips and jaw positions. We then use feature-space-switched articulatory-to-acoustic regression matrices to enable us to control the spectral acoustic features by manipulating the articulatory features. Combining this synthesis model with MAGE allows us to interactively and intuitively modify phones synthesized in real time, for example transforming one phone into another, by controlling the configuration of the articulators in a visual display. Index Terms: speech synthesis, reactive, articulators

CiteSeerX

Edinburgh Research Explorer

Gesture Control of HMM-Based Singing Voice Synthesis

Author: Astrinaki Maria
Clark Robert
Oura K.
Veaux Christophe
Yamagishi Junichi
Publication date: 01/08/2013

Edinburgh Research Explorer

Reactive accent interpolation through an interactive map application

Author: Astrinaki Maria
d'Alessandro Nicolas
Dutoit Thierry
King Simon
Yamagishi Junichi
Publication date: 01/08/2013

Edinburgh Research Explorer

Mage - HMM-based speech synthesis reactively controlled by the articulators

Author: Astrinaki Maria
Dutoit Thierry
King Simon
Ling Zhen-Hua
Moinet Alexis
Richmond Korin
Yamagishi Junichi
Publication date: 01/09/2013

Edinburgh Research Explorer

Reactive Statistical Mapping: Towards the Sketching of Performative Control with Data

Author: Astrinaki Maria
Babacan Onur
Barbulescu Adela
Cakmak Huseyin
Dall Rasmus
d’Alessandro Nicolas
Hu Qiong
Hueber Thomas
Huguenin Victor
Kalaycı Emine Sümeyye
Moinet Alexis
Parfait Valentin
Ravet Thierry
Tilmanne Joëlle
Publication venue: Springer Science and Business Media LLC
Publication date: 15/07/2013

Part 1: Fundamental Issues

This paper presents the results of our participation in the ninth eNTERFACE workshop on multimodal user interfaces. Our target for this workshop was to bring some technologies currently used in speech recognition and synthesis to a new level, i.e. to make them the core of a new HMM-based mapping system. We investigated the idea of statistical mapping, more precisely how to use Gaussian Mixture Models and Hidden Markov Models for realtime and reactive generation of new trajectories from input labels and for realtime regression in a continuous-to-continuous use case. As a result, we have developed several proofs of concept, including an incremental speech synthesiser, software for exploring stylistic spaces for gait and facial motion in realtime, reactive audiovisual laughter, and a prototype demonstrating the realtime reconstruction of lower body gait motion strictly from upper body motion, with conservation of the stylistic properties. This project has been the opportunity to formalise HMM-based mapping, integrate several of these innovations into the Mage library, and explore the development of a realtime gesture recognition tool.

CiteSeerX

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Real-time voice pathology detection using autocorrelation and short-term jitter estimates (Ανίχνευση παθολογίας φωνής σε πραγματικό χρόνο με χρήση αυτοσυσχέτισης και σύντομες εκτιμήσεις του Jitter)

Author: Astrinaki Maria
Αστρινάκη Μαρία Ιωάννη
Publication date: 19/11/2010

Voice is the result of the coordination of the whole pneumophonoarticulatory apparatus. Voice pathologies are an increasing social concern, as voice and speech play an important role in certain professions and in the general quality of life of the population. Analysis of the voice allows the detection and identification of diseases of the vocal apparatus. Today this identification is carried out by an expert physician through a classical medical (ENT) examination, through invasive imaging methods, and through non-invasive methods based on acoustic analysis of the speech produced by the patient. In recent years, emphasis has been placed on early pathology detection, for which classical perturbation measurements (jitter, shimmer, HNR, etc.) are used. Going one step further, the present work aims to implement and apply a real-time voice pathology detection system, combined with a Java interface.

E-Locus

Reactive and Continuous Control of HMM-Based Speech Synthesis

Author: Benjamin Picart
Maria Astrinaki
Thierry Dutoit
Thomas Drugman

In this paper, we present a modified version of HTS, called performative HTS or pHTS. The objective of pHTS is to enhance the control ability and reactivity of HTS. pHTS reduces the phonetic context used for training the models and generates the speech parameters within a 2-label window. Speech waveforms are generated on-the-fly and the models can be reactively modified, impacting the synthesized speech with a delay of only one phoneme. It is shown that HTS and pHTS have comparable output quality. We use this new system to achieve reactive model interpolation and conduct a new test where articulation degree is modified within the sentence. Index Terms: speech synthesis, HTS, reactive control

CiteSeerX